Slovak Morphosyntactic Tagset
Similar resources
Slovak Morphosyntactic Tagset
Morphological annotation constitutes essential, very useful and very common linguistic information presented in corpora, especially for highly inflectional languages. The morphological tagset used in the Slovak National Corpus has been designed with several goals in mind – the tags are compact and easily human-readable, without sacrificing their informational contents. The tags consist of ASCII...
The Development of a Morphosyntactic Tagset for Afrikaans and its Use with Statistical Tagging
In this paper, we present a morphosyntactic tagset for Afrikaans based on the guidelines developed by the Expert Advisory Group on Language Engineering Standards (EAGLES). We compare our slim yet expressive tagset, MAATS (Morphosyntactic AfrikAans TagSet), with an existing one which primarily focuses on a detailed morphosyntactic and semantic description of word forms. MAATS will primarily be u...
Reusable Tagset Conversion Using Tagset Drivers
Part-of-speech or morphological tags are important means of annotation in a vast number of corpora. However, different sets of tags are used in different corpora, even for the same language. Tagset conversion is difficult, and solutions tend to be tailored to a particular pair of tagsets. We propose a universal approach that makes the conversion tools reusable. We also provide an indirect evalu...
Bottom Up Tagset Design from Maximally Reduced Tagset
For highly inflectional languages, where the number of morpho-syntactic descriptions (MSD) is very high, the use of a reduced tagset is crucial for reasons of implementation problems as well as the problem of sparse data. The standard procedure is to start from the large set of MSDs incorporating all morphosyntactic features and design a reduced tagset by eliminating the attributes which play no...
Rule-based Tagging: Morphological Tagset versus Tagset of Analytical Functions
This work presents a part of a more global study on the problem of parsing of Czech and on the knowledge extraction capabilities of the Rule-based method. It is shown that the success of the Rule-based method for English and its lack of success for Czech are not only due to the small cardinality of the English tagset (as is usually claimed) but mainly depend on its structure ("regul...
Journal
Journal title: Journal of Language Modelling
Year: 2012
ISSN: 2299-8470, 2299-856X
DOI: 10.15398/jlm.v0i1.35